in silico Plants

Oxford University Press (OUP)

Preprints posted in the last 90 days, ranked by how well they match in silico Plants's content profile, based on 24 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.

1
OpenAlea.HydroRoot: A modelling framework to dissect, predict and phenotype branched root hydraulic architecture

Bauget, F.; Ndour, A.; Boursiac, Y.; Maurel, C.; Laplaze, L.; Lucas, M.; Pradal, C.

2026-03-23 plant biology 10.64898/2026.03.19.713025 medRxiv
Top 0.1%
32.9%

Drought is a significant factor in agricultural losses, making it imperative to understand how root system architecture (RSA) adapts to environmental conditions such as water deficit. HydroRoot is a functional-structural plant model (FSPM) aimed at analyzing and simulating hydraulic and solute transport in RSA. The model integrates a static hydraulic solver, a coupled water-solute transport solver, a statistical generator of RSA based on a Markov model, and a dynamic hydraulic model accounting for root growth. This paper presents the model, the mathematical description of the formalism of the solvers, and use cases with their associated tutorials. Five use cases illustrate the capabilities of HydroRoot, which has been successfully used for phenotyping root hydraulics across various species, including Arabidopsis, maize, and millet. The model-driven phenotyping method "cut and flow" is presented to characterize axial and radial conductivities of a given root genotype. Finally, three step-by-step tutorials provide a structured way to learn how to use HydroRoot 1) to simulate hydraulics on a given architecture, 2) to simulate water and solute transport on a maize root, and 3) to simulate hydraulics on two pearl millet genotypes with varying soil conditions. HydroRoot is an open-source package of the OpenAlea platform, with the code publicly available on GitHub. Comprehensive documentation is available with a reproducible gallery of examples.

2
Dissecting the Network Architecture of a Plant Circadian Clock Model: Identifying Key Regulatory Mechanisms and Essential Interactions

Singh, S. K.; Srivastava, A.

2026-03-18 systems biology 10.64898/2026.03.15.711848 medRxiv
Top 0.1%
22.8%

Circadian rhythms are self-sustained biological oscillations that coordinate diverse physiological processes in plants, including growth, metabolism, and environmental responses. These rhythms arise from an interconnected transcriptional-translational feedback network that integrates multiple entrainment cues such as light and temperature. The plant circadian clock is organized around key regulatory loops involving CCA1, LHY, PRRs, TOC1, ELF4, LUX, and other transcriptional regulators, whose coordinated interactions ensure precise and robust oscillations. In this study, we developed an ordinary differential equation-based mathematical model, building upon a previous framework to incorporate additional regulatory modules and transcriptional controls that better reflect experimentally observed behaviour. To elucidate the regulatory organization of this model, we performed a multi-layered computational analysis combining four complementary approaches: (i) period sensitivity analysis to quantify how parameter perturbations influence the system's timing, (ii) phase portrait analysis to visualize dynamic interactions among key components, (iii) knockout analysis to identify parameters essential for sustained rhythmicity, and (iv) network impact analysis using composite weighted network indices to evaluate hierarchical control across the network. Together, these analyses reveal that transcriptional repression, protein degradation, and light-regulated synthesis form the dominant control mechanisms within the circadian system. The results highlight a hierarchical and robust network structure centred on the CCA1/LHY and PRRs feedback loop, with redundant modules ensuring stability under perturbations. Thus, this model provides an improved, biologically consistent framework for dissecting the dynamic architecture of the plant circadian clock and guiding future experimental validation.

3
Quantifying the effect of cereal plant trait plasticity on weed suppression in intercrops

Kottelenberg, D. B.; Morales, A.; Anten, N. P. R.; Bastiaans, L.; Evers, J. B.

2026-04-03 plant biology 10.64898/2026.04.01.715874 medRxiv
Top 0.1%
22.0%

In cereal-legume intercrops, weed suppression is primarily driven by cereals, whose competitiveness is shaped by trait plasticity, that is, morphological adjustments in response to the intercrop environment. However, how individual cereal traits respond plastically and contribute to system performance remains unclear, hampering improvements through breeding or system design. We combined field experiments with functional-structural plant modelling to quantify plastic responses of four cereal traits (tiller number, tiller angle, specific leaf area (SLA), and specific internode length (SIL)) and their effects on weed suppression and crop productivity. Field measurements revealed plasticity in tiller number, tiller angle, and SIL between sole crops and intercrops, while SLA showed minimal differences. Simulations showed that intermediate tiller numbers resulted in the strongest weed suppression and highest productivity, indicating an optimum, while more horizontal tillers suppressed weeds slightly better than vertical ones. Weed suppression increased with higher SLA values, while SIL showed a saturating response, increasing to intermediate SIL values and plateauing thereafter. In simulations with short-statured cereal phenotypes (low SIL), the reduction in cereal weed suppression was compensated by the legume component. This study demonstrates how FSP modelling can be used to investigate trait plasticity mechanisms and generate testable hypotheses about trait effects in complex intercrop systems.
Highlight: Cereal trait plasticity shapes weed suppression in cereal-legume intercrops, with distinct response patterns per trait, while legumes can compensate for weakly competitive cereals, suggesting balanced competition over cereal dominance.

4
Secondary Growth and Exodermal Barriers Shape Local Root Hydraulics: Modeling Insights in Tomato

D'Agostino, M.; Schoppach, R.; Heymans, A.; Couvreur, V.; Lobet, G.

2026-01-29 plant biology 10.64898/2026.01.27.701735 medRxiv
Top 0.1%
21.9%

Root water uptake efficiency depends on root system architecture and anatomical features of individual root segments. Beyond cell wall, membrane, and plasmodesmata hydraulic properties, root anatomy critically influences profiles of radial conductivity and axial conductance. While these structural factors have been well-characterized in monocotyledons, their role in dicotyledons (where developmental anatomy, secondary growth, and hydrophobic barrier dynamics differ) remains poorly understood. Here, we integrate structural and functional models to assess how dicotyledon-specific anatomy, hydrophobic depositions (suberin/lignin in exo-/endodermis), and aquaporin contribution influence root hydraulics. Using tomato (Solanum lycopersicum L., cv. Moneymaker) as a dicotyledon model, our simulations show that:
- Exodermal suberin has negligible effects on radial conductivity when a lignin cap is present, and exodermal barriers are less effective than endodermal ones.
- Secondary growth and dicotyledon-specific anatomy are essential for sustaining high axial conductance, ensuring efficient water uptake across soil profiles and maintaining root system hydraulic conductance.

5
GE-BiCross: A Hierarchical Bidirectional Cross-Attention Framework for Genotype-by-Environment Prediction in Maize

Zhou, S.; Zhao, T.

2026-03-12 bioinformatics 10.64898/2026.03.10.710816 medRxiv
Top 0.1%
14.7%

Genotype-by-environment interactions are central to crop adaptation and yield stability, yet they remain difficult to model for robust prediction across heterogeneous environments. Although enviromic profiling has improved the characterization of dynamic field conditions, most existing genomic prediction methods adopt a late-fusion strategy that encodes genomic and environmental information independently before global integration, thereby limiting their ability to resolve fine-scale, context-dependent G x E effects. Here, we developed GE-BiCross, a hierarchical bidirectional cross-attention framework for maize prediction. GE-BiCross incorporates a dual-path feature extraction module to disentangle independent and cooperative effects, a tokenized bidirectional cross-attention module to enable reciprocal genotype-environment interaction learning, and a mixture-of-experts module to adaptively capture heterogeneous response patterns across environments. Using a large-scale dataset of approximately 360,000 observations from 4,923 maize hybrids evaluated in 241 environments, GE-BiCross consistently outperformed conventional genomic prediction, machine learning, and deep learning baselines across six agronomic traits. The greatest improvements were observed for environmentally responsive and genetically complex traits. In particular, GE-BiCross achieved an R² of 0.672 for grain yield and 0.880 for grain moisture, significantly surpassing all comparison models. Ablation analyses demonstrated that the three core modules make distinct and complementary contributions to predictive performance. These results show that deep, bidirectional integration of genomic and enviromic information can substantially improve modeling of complex G x E interactions, providing a powerful framework for interpretable genomic prediction and climate-smart crop breeding.

6
How important is the intra-regional soil heterogeneity for the design of future stress-avoidant wheat ideotypes? A modeling study in central France

Blanchet, G.; Semenov, M. A.; Allard, V.

2026-02-12 plant biology 10.64898/2026.02.11.705307 medRxiv
Top 0.1%
12.7%

Accurate projections of crop adaptation to climate change require accounting for the spatial heterogeneity of soils, which modulates both water availability and the effectiveness of genetic adaptation. Using the process-based crop model Sirius, we investigated how intra-regional variability in soil available water capacity (AWC) influences wheat yields and the adaptive value of stress-avoidant ideotypes under future climates in central France (Limagne plain). Detailed soil databases were aggregated across five representative sites and combined with multiple climate projections (CMIP6), two emission pathways (SSP2-4.5 and SSP5-8.5), and three time horizons (2031-2050, 2051-2070 and 2071-2090). Variance decomposition revealed that soil AWC accounted for 23% of the simulated yield variability, significantly exceeding the contribution of local climate contrasts (10%), a pattern consistent across current and future periods. Deep soils (>80 mm AWC) buffered drought effects whereas yields stagnated in shallow soils (<80 mm AWC) where water deficits persisted despite phenology hastening. On average, the reference cultivar showed earlier anthesis by 8-21 days under future climates, leading to higher yields mainly in deep soils. Optimization of flowering timing through stress-avoidant ideotypes provided mean yield gains of +6.33 dt·ha⁻¹ in deep soils, but limited benefits (+1.71 dt·ha⁻¹) in shallow ones, highlighting pedological dependence of breeding efficiency. Advancing anthesis also increased exposure to early-spring frost: frost probability rose from <0.1 to >0.4 when flowering occurred more than 250 °C·days earlier, particularly in the frost-prone part of the study area. Hence, frost risk remains a critical constraint for early ideotypes, even under strong warming. Overall, our results demonstrate that intra-regional soil heterogeneity remains a dominant driver of wheat yield variability and adaptation potential under climate change.
Designing stress-avoidant ideotypes without explicit consideration of local soil AWC could lead to maladaptation, especially in regions where shallow soils represent a significant portion of cropped areas. In such situations, breeding for terminal stress avoidance may offer only limited benefit. We advocate that breeding and modeling frameworks integrate high-resolution soil data to refine regional ideotype design, reconcile terminal-stress avoidance with frost tolerance, and better capture the spatial realism required for sustainable crop adaptation strategies.
Highlights:
- Local soil water capacity limits wheat adaptation to climate change.
- Deep soils favor earlier, stress-avoidant ideotypes.
- Shallow soils restrict the benefits of phenological adjustment for stress avoidance.
- Frost exposure remains a key risk when shifting phenology toward earliness.

7
Joint modeling of social genetic effects in mono- and pluri-specific groups: case study in intercrops

Salomon, J.; Enjalbert, J.; Flutre, T.

2026-03-31 genetics 10.64898/2026.03.27.714849 medRxiv
Top 0.1%
10.1%

The genetics of interspecific groups remains largely unexplored, despite the central role of social (or indirect) genetic effects in shaping phenotypic expression within communities. Intercropping, i.e. the simultaneous cultivation of multiple crop species in the same field, offers a powerful model to harness these interspecific social effects. Such species mixtures provide well-documented agricultural benefits, yet few breeding frameworks have integrated the genetics of social interactions. Here, we address this gap by extending quantitative genetic theory to interspecific groups, with intercropping as a concrete and applied model case. We propose a quantitative genetic model that jointly analyzes intra- and interspecific interactions within a unifying framework. Breeding values are decomposed into a direct component, shared between mono- and mixed crops, an interspecific social component corresponding to the effect of one species on another, and an intraspecific component that captures the social effects within a mono-genotypic stand of cloned plants. Statistically, this consists of simultaneously fitting several linear mixed models, one per stand type, all having direct breeding values in common. As no open-source software can fit such a complex mixed model, we provide such an implementation in R/C++. Simulations across various genetic (co)variance structures and sparse experimental designs showed accurate estimation of all genetic (co)variances and breeding values. With an incomplete, yet balanced design combining sole crops and intercrops, genetic gains in both systems were achievable simultaneously, enabling breeding strategies that progressively integrate intercropping into existing, sole-crop-only schemes. More broadly, this framework allows dissecting direct and social genetic effects when genotypes are observed in mono- and mixed-species situations, cultivated or not.

8
Epistatic fitness landscapes emerge from parallel adaptive walks in breeding network metapopulations

Monyak, T.; Morris, G.

2026-03-20 genetics 10.64898/2026.03.18.712732 medRxiv
Top 0.1%
9.1%

Global networks of crop breeding programs leverage diverse germplasm, but diversity increases the complexity of maintaining stability in their elite genepools. To characterize genetic heterogeneity in breeding metapopulations and develop insights on how to manage it, we simulated the evolution of breeding populations on fitness landscapes. We revealed the geometric decrease in the average effect size of alleles segregating as standing variation that become fixed along an adaptive walk. We also demonstrated how independent adaptive walks of subpopulations are influenced by genetic drift, leading to cryptic genetic heterogeneity among elite genepools. This variation is released when elite lines derived from independent subpopulations are crossed, leading to segregation for 2-4X more major QTL in admixed families than in unadmixed families, and 2-4X more epistatic interactions. The emergent property of fitness epistasis for traits under stabilizing selection is well-understood in evolutionary genetics, but under-appreciated in crop quantitative genetics. To highlight the importance of this phenomenon, we constructed an empirical genotype-to-fitness landscape from the sorghum NAM, a global admixed prebreeding resource, demonstrating the utility of fitness landscapes for inferring genetic compatibilities within metapopulations. Our findings suggest that in breeding networks, strategies for effective germplasm exchange must account for epistasis in the oligogenic component of the genetic architecture of locally-adapted traits.
Article summary: Modern public sector crop improvement happens in networks of breeding programs that routinely exchange genetic information. Traditional models for understanding quantitative traits have limited predictiveness in situations with such genetic heterogeneity. This study uses breeding simulations and empirical data to show the utility of the fitness landscape framework for characterizing the genetic architecture of complex traits in breeding metapopulations. By simulating the evolution of breeding programs and integration into networks, it demonstrates how epistatic interactions between large-effect alleles are a fundamental property that must be accounted for when exchanging germplasm.
[Graphical abstract: Figure 1]

9
BioOS: A Gene-Driven Digital Twin Runtime for Emergent Plant Development

AUGER, E.; Gandecki, M.; Delarche, C.; Heng, F. X.

2026-03-17 bioinformatics 10.64898/2026.03.14.711542 medRxiv
Top 0.1%
8.3%

Predicting plant mutant phenotypes requires models that connect gene regulation to organ-scale morphogenesis without collapsing mechanism into phenomenological rules. We present BioOS, a curated mechanistic runtime built around the Formal Cell, a minimal signal-processing abstraction in which promoter evaluation, transcription, translation, and protein state drive cell division, differentiation, and elongation. A multi-scale architecture combining TissueUnits and FormalCells enables real-time simulation of bounded Arabidopsis thaliana developmental programs while keeping the primary claim anchored in primary-root auxin transport. On the official five-case root-auxin benchmark, BioOS achieves a 75.4% mean score, 5/5 qualitative matches, 5/5 quantitative passes, and Spearman severity correlation ρ = 0.70. The deployed auxin slice uses a curated 35-gene registry; for readability, this manuscript details an 18-gene core subnetwork. Beyond the primary auxin claim, the same runtime closes official cytokinin (5/5), flowering (5/5), and photosynthesis (7/7) gates, while a candidate root-patterning panel passes 8/8. BioOS should therefore be read as a benchmark-validated runtime for bounded developmental prediction rather than as a single-slice demonstration.
Key Results: The following results summarize the current validation status of BioOS on the primary-root auxin slice and its surrounding benchmark framework:
- 35-gene root-auxin runtime, with an 18-gene core GRN illustrated here
- Emergent division, differentiation, elongation from gene expression
- Official auxin panel closed: 5/5 qualitative matches, 5/5 full passes, 75.4% mean score, ρ = 0.70 severity ranking
- Four official gates closed in one runtime: root_auxin 5/5, cytokinin 5/5, flowering 5/5, photosynthesis 7/7
- Broader benchmark corpus: 6 suites / 63 cases; the 8-case candidate root-patterning panel currently passes 8/8
- Real-time capable: 175 TissueUnits + 200 cells = 8 ms/tick

10
Inter-variety competition dynamics in US inbred and hybrid maize

Schulz, A. J.; Bohn, M. O.; Bradbury, P.; Lima, D. C.; De Leon, N.; Flint-Garcia, S.; Holland, J. B.; Lepak, N.; Lorenz, A. J.; Romay, M. C.; Hirsch, C. N.; Buckler, E. S.; Robbins, K. R.

2026-02-28 plant biology 10.64898/2026.02.26.708322 medRxiv
Top 0.1%
8.3%

Variety mixtures provide a potential avenue in US cropping systems to improve yield stability and disease resistance. However, implementation of variety mixtures requires an understanding of the competitive dynamics of the crop. In this study, we examine the effects of plant competition both between and within plots through five unique experiments: 1) 5,000 diverse inbred lines in single-row plots, 2) hybrids in two-row plots developed from the above inbred lines, 3) over 4,000 hybrids measured in 141 locations in two-row plots as part of Genomes to Fields, 4) mixtures of two hybrids within a two-row plot planted across two years and five locations, and 5) mixtures of up to twenty hybrids in four-row plots in three locations. Across all experiments, we find that competitive interactions are extremely limited. Within inbred lines, height of the neighboring plot accounts for 1.2% of the variance in focal plot height. Similarly, neighbor height explains 1.7% of the variance in focal plot yield in hybrids developed from the inbred lines. The genetics of neighboring plots explains 1.55% of the variation in yield across 141 location-year environments, reinforcing the generally modest impacts of neighbor competition. In evaluating mixtures of hybrids in both two- and four-row plots, we observe no yield penalty compared to conventional single-hybrid plots, even with large height differentials of the hybrids included in the mixture or in mixtures of up to 20 hybrids within a plot. Finally, we observe that mixtures have more yield stability compared to conventional plots, highlighting a new avenue for increased stability in higher-risk environments. The lack of yield penalty and the stability benefits are promising for future investigations of mixtures that may complement each other in disease resistance or abiotic stress tolerance and increase overall yield stability in the field.

11
Bridging human and plant adaptations for climate resilience

Favretto, N.; Tan, H. L.; Brain, G.; Ezer, D.

2026-02-23 plant biology 10.64898/2026.02.20.706989 medRxiv
Top 0.1%
6.2%

- Climate change is reshaping agriculture through both gradual shifts and increasingly unpredictable extremes. Plants cope using developmental plasticity and bet-hedging, but it is unclear how these biological strategies align with the ways farmers perceive and respond to climate risks. This study investigates: (1) whether farmers understand climate change as incremental trends or recurrent shocks, (2) how their adaptations parallel plant plasticity and bet-hedging, and (3) under which climate scenarios these adaptations best support yield stability.
- We combined qualitative research and modelling by conducting fifty semi-structured interviews with farmers, agricultural associations and public administrators across three climatically distinct Italian regions, and by developing an agent-based stochastic simulation that represents farmer-like plasticity (delayed sowing) and bet-hedging (staggered sowing) under drought and flood scenarios.
- Farmers described climate change as both gradual transformation and intensifying volatility. Their adaptive responses - adjusting calendars, switching crops and diversifying production - closely aligned with plant strategies, though articulated in practical rather than scientific terms. Simulation results showed that plasticity enhanced yields under systematic shifts in conditions, whereas bet-hedging reduced losses in highly variable climates characterised by frequent transitions between extremes.
- Together, the qualitative and modelling findings demonstrate that plant and farmer adaptation logics converge in complementary ways. Plasticity supports performance under gradual change, while bet-hedging buffers unpredictability. These insights highlight the potential for co-designed tools that link plant traits, farmer decision-making and ecological risk, strengthening climate-resilient agricultural planning and improving communication between farmers, breeders and plant scientists.
Societal Impact Statement: Climate change is transforming agriculture through both gradual shifts and increasingly unpredictable extremes, challenging farmers' ability to protect crops and livelihoods. This study brings together farmer experiences and plant adaptation strategies to explore how people and plants respond to similar climate pressures. By showing that farmers' practices mirror plant plasticity and bet-hedging, our findings highlight opportunities to design climate-resilient agriculture that aligns biological traits with real-world decision-making. This work can inform plant breeders, extension services and policymakers seeking to support farmers through clearer communication, better risk-management tools and more adaptable crop varieties, ultimately strengthening resilience in food systems.

12
Radiation-Driven Prediction of Daily Irrigation Demand under Different Electrical Conductivity Scenarios in Greenhouse Tomato

Xiao, L.

2026-01-24 plant biology 10.64898/2026.01.23.701235 medRxiv
Top 0.1%
4.9%

In soilless greenhouse tomato cultivation, daily transpiration and irrigation demand are largely governed by solar radiation, while irrigation-solution electrical conductivity (EC) used for salinity management may further modulate plant water use. This study developed a low-input, radiation-driven modeling approach to predict daily irrigation demand under contrasting water-salt management scenarios. Two tomato cultivars were grown under four treatments: conventional baselines (CK1, CK2) and regulated scenarios combining irrigation volume with solution EC (low-water high-EC, TK; high-water moderate-EC, TC). Daily irrigation volume (I) and drainage were recorded, and daily cumulative radiation (G) was derived from photosynthetically active radiation (PAR). Within each treatment, we compared a radiation-only baseline model with an EC-adjusted model and evaluated predictive performance using 5-fold blocked time-series cross-validation. Results showed strong positive correlations between G and I across all treatments (p < 0.001). The EC-adjusted models achieved cross-validated root-mean-square errors (RMSE) of 0.815-1.393 L d⁻¹ per trough and Nash-Sutcliffe efficiencies (NSE) of 0.407-0.730. Incorporating EC yielded a small but consistent improvement under the TK scenario (ΔRMSE = -0.014 L d⁻¹; ΔNSE = +0.019), whereas its effect was negligible or slightly negative under CK1, CK2, and TC, highlighting scenario dependence. Our radiation-driven framework, with an optional EC correction, offers a practical and scalable tool for daily irrigation forecasting and supports integrated water-salt management in soilless greenhouse tomato production.
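
For readers unfamiliar with the metrics quoted in this abstract, the following is a minimal sketch of RMSE, the Nash-Sutcliffe efficiency, and one plausible way to form blocked (contiguous, unshuffled) cross-validation folds. It is not the author's code: the function names and the equal-sized contiguous blocking are assumptions made for illustration, and the abstract does not say how the blocks were defined.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus residual sum of squares over
    the sum of squares around the observed mean. 1 is a perfect fit;
    0 means no better than predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def blocked_folds(n_samples, k=5):
    """Split indices 0..n_samples-1 into k contiguous blocks in time order.
    Assumption: blocks are as equal in size as possible and never shuffled."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds
```

Each block would serve once as the held-out test set while a model is fitted on the remaining days; keeping blocks contiguous avoids mixing neighbouring, autocorrelated days across the training and test sets.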

13
Uncovering genetic mechanisms underlying trait variation in switchgrass using explainable artificial intelligence

Izquierdo, P.; Weng, X.; Juenger, T.; Bonnette, J. E.; Yoshinaga, Y.; Daum, C.; Lipzen, A.; Barry, K.; Blow, M. J.; Lehti-Shiu, M. D.; Lowry, D.; Shiu, S.-H.

2026-03-09 genetics 10.64898/2026.03.06.710154 medRxiv
Top 0.1%
4.7%

Uncovering the genetic architecture of quantitative traits is challenging because polygenic control yields small individual gene effects and because gene-gene and genotype-by-environment interactions add further complexity. To understand the genetic basis of polygenic traits and their plasticity across environments, we integrated genome-wide SNPs and RNA-seq transcript data with interpretable statistical and machine learning models in a switchgrass (Panicum virgatum) diversity panel grown at contrasting field sites in Michigan and Texas. Notably, in addition to single environments, our trait prediction models were able to predict phenotypic differences across environments, i.e., plasticity. By interpreting trait prediction models with explainable artificial intelligence methods, we identified important features, namely genes that are the most predictive of flowering time and annual biomass production across environments, based on their associated gene expression levels and nearby SNPs. This approach recovered canonical flowering regulators and revealed novel, environment-specific candidate flowering genes. Further, transcriptome models consistently recovered more switchgrass genes homologous to experimentally validated genes in Arabidopsis and rice than SNP-based models. Feature interaction scores from the models also allow the identification of trait- and environment-dependent gene-gene interactions, where flowering time showed stronger and more abundant interactions than biomass. While some of the interactions identified are consistent with the link between flowering time and yield, most are novel predictors that need to be further evaluated. Together, these results demonstrate that interpretable genomic prediction with explainable artificial intelligence approaches can convert trait prediction models into mechanistic hypotheses about putative causal genes and interactions controlling traits within and across environments. These results will help to prioritize target genes for validation and inform germplasm selection for cultivar improvement.

14
Improved Ensemble Performance by Weight Optimisation for the Genomic Prediction of Maize Flowering Time Traits

Tomura, S.; Powell, O. M.; Wilkinson, M. J.; Lefevre, J.; Cooper, M.

2026-02-06 bioinformatics 10.64898/2026.02.03.703660 medRxiv
Top 0.1%
4.4%

Ensembles of multiple genomic prediction models have demonstrated improved prediction performance over the individual models contributing to the ensemble. The outperformance of ensemble models is expected from the Diversity Prediction Theorem, which states that for ensembles constructed with diverse prediction models, the ensemble prediction error becomes lower than the mean prediction error of the individual models. While a naive ensemble-average model provides baseline performance improvement by aggregating all individual prediction models with equal weights, optimising weights for each individual model could further enhance ensemble prediction performance. The weights can be optimised based on their level of informativeness regarding prediction error and diversity. Here, we evaluated weighted ensemble-average models with three possible weight optimisation approaches (linear transformation, Nelder-Mead and Bayesian) using flowering time traits from two maize nested association mapping (NAM) datasets: TeoNAM and MaizeNAM. The three proposed weighted ensemble-average approaches improved prediction performance in several of the prediction scenarios investigated. In particular, the weighted ensemble models enhanced prediction performance when the adjusted weights differed substantially from the equal weights used by the naive ensemble models. For performance comparisons within the weighted ensembles, there was no clear superiority among the proposed approaches in either prediction accuracy or error across the prediction scenarios. Weight optimisation in ensembles warrants further investigation to explore the opportunities to improve their prediction performance; for example, integration of a weighted ensemble with a simultaneous hyperparameter tuning process may offer a promising direction for further research.
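
The Diversity Prediction Theorem mentioned in this abstract is an exact identity for squared error: the ensemble's squared error equals the (weighted) mean squared error of the individual models minus the (weighted) variance of their predictions around the ensemble prediction. The sketch below checks this numerically for a single target value. It is an illustration only, not the authors' implementation; the function name and the toy numbers are hypothetical, and the weights are assumed to be non-negative and to sum to 1.

```python
def diversity_decomposition(predictions, truth, weights=None):
    """Return (ensemble_error, mean_individual_error, diversity) for one target.

    ensemble_error        = (sum_i w_i * s_i - truth)^2
    mean_individual_error = sum_i w_i * (s_i - truth)^2
    diversity             = sum_i w_i * (s_i - ensemble)^2
    Identity: ensemble_error == mean_individual_error - diversity.
    Assumes the weights are non-negative and sum to 1.
    """
    m = len(predictions)
    if weights is None:
        weights = [1.0 / m] * m  # naive ensemble: equal weights
    ensemble = sum(w * s for w, s in zip(weights, predictions))
    ensemble_error = (ensemble - truth) ** 2
    mean_error = sum(w * (s - truth) ** 2 for w, s in zip(weights, predictions))
    diversity = sum(w * (s - ensemble) ** 2 for w, s in zip(weights, predictions))
    return ensemble_error, mean_error, diversity


# Toy example: three model predictions of one value whose true value is 13.
naive = diversity_decomposition([10.0, 14.0, 18.0], 13.0)
weighted = diversity_decomposition([10.0, 14.0, 18.0], 13.0, [0.5, 0.25, 0.25])
```

Because the diversity term is never negative, the ensemble error cannot exceed the weighted mean individual error; changing the weights shifts both terms, which is the room that weight optimisation has to work with.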

15
Branching Varies with Light Limitation Scenarios in relation with Changes in Carbon Source-Sink Dynamics.

Schneider, A.; Boudon, F.; Demotes-Mainard, S.; Ledroit, L.; Perez-Garcia, M.-D.; Cassan, C.; Gibon, Y.; Godin, C.; Sakr, S.; Bertheloot, J.

2026-01-29 plant biology 10.64898/2026.01.27.702021 medRxiv
Top 0.1%
4.0%

Bud outgrowth is a major component of plant architectural plasticity and is influenced by light conditions. While the inhibitory effect of low light intensity on branching is well documented, the underlying regulators remain debated and, especially, the role of sugar availability has never been thoroughly evaluated. Here, we combined experiments with a computational approach quantifying carbon source-sink balance in single-axis rose plants to investigate how continuous and transient light limitation regulate bud outgrowth. Continuous low light reduced photosynthesis, leading to decreased sugar availability and inhibited bud outgrowth. In contrast, a transient period of low light followed by high light unexpectedly stimulated bud outgrowth, shortened the delay between outgrowth of successive buds, and produced an over-branched phenotype. This response resulted from a non-reversible reduction in the growth of apical organs appearing under low light, which lowered carbon demand and caused sugar over-accumulation after the return to high light. Manipulating carbon supply and demand through leaf masking, photosynthetic inhibition, and targeted sucrose feeding causally confirmed the central role of sugar availability in these contrasting responses. Beyond these findings, key requirements for models simulating branching plasticity were identified and this work provides a basis for predicting branching responses under fluctuating and complex light environments.
Highlight: Bud outgrowth, a key component of plant plasticity, is regulated by light intensity through sugar availability. Continuous and transient low light have opposite effects by limiting sugar production and use, respectively.

16
Robust Random Forests for Genomic Prediction: Challenges and Remedies

Lourenco, V. M.; Ogutu, J. O.; Piepho, H.-P.

2026-04-01 bioinformatics 10.64898/2026.03.30.715203 medRxiv
Top 0.1%
3.5%

Data contamination, from recording errors to extreme outliers, can compromise statistical models by biasing predictions, inflating prediction errors, and, in severe cases, destabilizing performance in high-dimensional settings. Although contamination can affect responses and covariates, we focus on response contamination and evaluate Random Forests through simulation. Using a synthetic animal-breeding dataset, we assess robust Random Forests across several contamination scenarios and validate them on plant and animal datasets. We thereby clarify the consequences of contamination for prediction, develop a robust Random Forest framework, and evaluate its performance. We examine preprocessing or data-transformation strategies, algorithmic modifications, and hybrid approaches for robustifying Random Forests. Across these approaches, data transformation emerges as the most effective strategy, delivering the strongest performance under contamination. This strategy is simple, general, and transferable to other Machine Learning methods, offering a remedy for robust genomic prediction. In real breeding data, robust Random Forests are useful when substantial contamination, phenotypic corruption, misrecording, or train-deployment mismatch is plausible and the goal is to recover a latent signal for genomic prediction and selection; ranking-based robust Random Forests are the dependable first option, whereas weighting-based Random Forests should be used only when their weighting scheme preserves rank structure and improves prediction. Robustification is not universally necessary, but it becomes important when contamination distorts the link between observed responses and the predictive target; standard Random Forests remain the default for clean data, whereas robust Random Forests should be fitted alongside them whenever contamination is plausible, with the final choice guided by data, trait, and breeding objective.
Author summary: Machine learning (ML) methods are widely used for prediction with high-dimensional, complex data, and supervised approaches such as Random Forests (RF) have proved effective for genomic prediction (GP) and selection. Yet their performance can be severely compromised by data contamination if the algorithms rely on classical data-driven procedures that are sensitive to atypical observations. Robustifying ML methods is therefore important both for improving predictive performance under contamination and for guiding their practical use in high-dimensional prediction problems. To address this need, we develop robust preprocessing, algorithm-level, and hybrid strategies for improving RF performance with contaminated data. Using simulated animal data, we show that ranking- and weighting-based robust RF provide the strongest overall compromise for genomic prediction and selection under contamination. Validation on several plant and animal breeding datasets further shows that the benefits of robustification are not universal, but depend on the dataset, trait, and breeding objective. Although motivated by RF, the framework we propose is general, practical, and readily transferable to other ML methods. It also offers a basis for deciding when robustness should complement standard RF rather than replace it outright.
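
As a rough illustration of why a rank-based transformation of the response can blunt contamination, the sketch below replaces phenotypes by their average ranks before any model is fitted. It is an assumption about what a ranking-based preprocessing step might look like, not the procedure used in the preprint; the function name and the toy phenotype values are hypothetical.

```python
def rank_transform(values):
    """Replace each value by its 1-based rank; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


# Hypothetical phenotypes; the last record is misrecorded in the second list.
clean = [4.1, 5.0, 5.3, 6.2, 7.0]
contaminated = [4.1, 5.0, 5.3, 6.2, 700.0]

clean_ranks = rank_transform(clean)
contaminated_ranks = rank_transform(contaminated)
print(clean_ranks)
print(contaminated_ranks)
```

In this toy case the misrecorded value keeps the same rank as the clean one, so a model trained on ranks sees the same response vector. The price is that the response loses its original scale, which matters if predictions must be reported in trait units.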

17
Dissecting oligogenic and polygenic indirect genetic effects through the lens of neighbor genotypic identity

Sato, Y.; Hamazaki, K.

2026-04-03 genetics 10.64898/2026.03.31.715746 medRxiv
Top 0.1%
3.5%

Individual phenotypes often depend on the genotypes of other individuals within a group. These phenomena are termed indirect genetic effects (IGEs) and have been distinguished from direct genetic effects (DGEs) using quantitative genetic models. Recent studies have utilized high-resolution polymorphism data to enable genomic prediction (GP) and genome-wide association study (GWAS) of IGEs, but unified methods remain limited. Here we integrate polygenic and oligogenic IGEs using a multi-kernel mixed model incorporating two random effects with a single covariance parameter. Underlying this implementation, the Ising model of ferromagnetism enabled us to simplify locus-wise and background IGEs for GWAS and GP, respectively. Our simulations demonstrated that, while the previous and present models exhibited similar performance, the present model can infer a trade-off between DGEs and IGEs. By applying this method to three species of woody plants, we found evidence for intergenotypic competition in aspen and apple trees, but limited evidence in climbing grapevines. Based on GWAS, we also detected significant variants associated with the competitive IGEs on the apple trunk growth. Our study offers a flexible implementation for GWAS/GP of IGEs, thereby providing an effective tool to dissect the genetic architecture of group performance.

18
Dissecting genetic variance structure and evaluating genomic prediction models for single-cross hybrids derived from Stiff Stalk and Non-Stiff Stalk maize heterotic groups

Godoy, J. C.; Edwards, J.; Lee, E. C.; Mikel, M. A.; Fernandes, S. B.; Hirsch, C. N.; Berry, S. P.; Lipka, A. E.; Bohn, M. O.

2026-03-13 genetics 10.64898/2026.03.11.710575 medRxiv
Top 0.1%
2.6%

The early 20th-century discovery of heterosis and the establishment of heterotic groups transformed maize (Zea mays L.) into a keystone of global agriculture. However, maize breeding faces two significant challenges: the gradual decline of general combining ability (GCA) variance within heterotic groups and the impracticality of testing all possible single crosses in the early stages of a breeding program. Here, we developed genomic best linear unbiased prediction (GBLUP)-based multi-kernel models, using additive and two alternative non-additive genomic relationship matrices, to estimate the variance components associated with the GCA of Stiff Stalk (SS) and Non-Stiff Stalk (NSS) heterotic groups and the specific combining ability (SCA) arising from their crosses. We further applied these models to predict the performance of untested single-cross combinations under varying levels of parental information. We showed that the SS and NSS groups retained significant GCA variance across traits in both early- and late-maturity groups. The SS group, in contrast, exhibited no detectable GCA variance in grain yield for the intermediate-flowering subset of hybrids, highlighting a limitation for future genetic improvement. Furthermore, our results showed that GBLUP-based multi-kernel models effectively identified superior hybrids when parental information was available. In the absence of this information, however, these models underperformed compared to covariance-based approaches. Both types of non-additive matrices produced similar results, affirming the robustness of the inferred genetic architecture. Overall, this study sheds light on the future use of US maize commercial germplasm and demonstrates how GBLUP-based multi-kernel models can improve the efficiency of hybrid breeding programs.

19
In vivo validation of predicted fitness effects at single-base resolution in a Brachypodium distachyon mutant population

Moslemi, C.; Folgoas, M.; Yu, X.; Jensen, J. D.; Hentrup, S.; Li, T.; Wang, H.; Boelt, B.; Asp, T.; Sibout, R.; Ramstein, G. P.

2026-04-02 genomics 10.64898/2026.03.31.715642 medRxiv
Top 0.1%
2.4%

Computational tools, including biological language models (LMs), show substantial promise in predicting the impact of genetic variants on plant fitness. However, validating variant effect predictions (VEP) requires experimental populations where genetic variation consists of discrete point mutations rather than segregating recombination blocks. In this study, we generated a novel population of Brachypodium distachyon mutant lines to evaluate the accuracy of VEP at single-base resolution. These lines were advanced through single-seed descent for five generations (M1 to M5), with whole-genome sequencing performed at M2 and M5 and phenotypic measurements recorded at M3 and M4. Using state-of-the-art VEP models, we predicted the functional impact of missense protein-coding variants and gene-proximal non-coding variants. We validated these predictions by estimating the effect of mutations on whole-plant measurements (burden tests) and their probability of fixation from M2 to M5 (purging tests). Among missense variants, the protein LM ESM showed superior predictive accuracy compared to the bioinformatic standard SIFT and the genomic LM PlantCAD. Notably, the relationship between VEP scores and allele fixation suggested a log-linear relationship between VEP scores and variant fitness. Among gene-proximal variants, PlantCAD appeared more accurate than supervised models of regulatory activity, such as chromatin accessibility (a2z) and RNA abundance (PhytoExpr). Collectively, our findings highlight the utility of state-of-the-art VEP tools as predictors of fitness and demonstrate the potential of mutant populations to evaluate computational tools for precision breeding applications.

20
Ethylene biosynthesis in guard cells, not mesophyll, predominantly drives stomatal conductance responses to CO2

Roda, D. N.; Shapira, O.; Neta, D.; Gal, S.; Shemer, T. A.

2026-03-06 plant biology 10.64898/2026.03.05.708972 medRxiv
Top 0.1%
2.1%

Research and rationale: This study investigates whether tissue-specific ethylene biosynthesis regulates stomatal conductance (gs) responses to changing [CO2] in Arabidopsis thaliana. While guard cells sense [CO2], mesophyll-derived signals are also implicated in stomatal control. We aimed to determine if ethylene production in guard cells or mesophyll is the primary driver of CO2-induced gs regulation.
Methods: An acs octuple mutant with severely reduced ethylene production was complemented with tissue-specific ACS8/ACS11 transgenes driven by guard-cell, spongy-mesophyll, dual palisade/spongy-mesophyll, or whole-leaf promoters. Tissue-specific complementation in the different transgenic lines was confirmed and evaluated by qPCR, tissue-specific NEON expression, microscopic imaging, and ethylene production measurements. Gas-exchange measurements on intact plants recorded gs kinetics, CO2 assimilation, and water-use efficiency across CO2 shifts.
Key results: Guard-cell complementation nearly fully restored wild-type gs responses and reversed the mutant's aberrant leaf phenotype. Spongy-mesophyll complementation failed to rescue either trait, while dual palisade- and spongy-mesophyll complementation yielded only partial recovery.
Conclusion: Ethylene produced in guard cells is the dominant regulator of CO2-induced stomatal conductance regulation, with mesophyll-derived ethylene contributing secondarily via long-distance signaling or by augmenting the overall ethylene pool. These findings underscore the importance of spatially regulated ethylene biosynthesis in balancing carbon assimilation and transpiration.